Abstract

Modeling individual heterogeneity in capture probabilities has been one of the most challenging tasks in capture-recapture studies. Heterogeneity in capture probabilities can be modeled as a function of individual covariates, but the correlation structure among capture occasions should be taken into account. The proposed generalized estimating equations (GEE) and generalized linear mixed modeling (GLMM) approaches can be used to estimate capture probabilities and population size for capture-recapture closed population models. An example is used for an illustrative application and for comparison with currently used methodology. A simulation study is also conducted to show the performance of the estimation procedures. Our simulation results show that the proposed quasi-likelihood approach based on GEE provides lower SE than partial likelihood approaches based on either generalized linear models (GLM) or GLMM for estimating population size in a closed capture-recapture experiment. Estimator performance is good if a large proportion of individuals are captured. For cases where only a small proportion of individuals are captured, the estimates become unstable, but the GEE approach outperforms the other methods.
Introduction

Many estimation methods have been developed for the analysis of closed population capture-recapture data. For comprehensive material on the subject see, for instance, Otis et al. (1978), Seber (2002), Williams et al. (2002) and Amstrup et al. (2005). The most general capture-recapture closed population model considered by Otis et al. (1978) was denoted by $M_{tbh}$, where (h) is used to denote inherent individual heterogeneity, (t) time effect, and (b) behavioral response to capture. In this work, we are interested in estimating the population size and SE of a submodel of the type $M_h$, where individual heterogeneity can be modeled as a function of covariates. Development of capture-recapture models dealing with individual heterogeneity in capture probabilities has been one of the most challenging tasks. Failure to account for such heterogeneity has long been known to cause substantial bias in population estimates (Otis et al. 1978; Lee and Chao 1994; Hwang and Huggins 2005). Moreover, Link (2003) showed that without strong assumptions on the underlying distribution, estimates of population size under model $M_h$ are fundamentally nonidentifiable.

The use of covariates (or auxiliary variables), if available, has been proposed as an alternative way to partially cope with the problem of heterogeneous capture probabilities (Pollock et al. 1984; Huggins 1989; Alho 1990). The idea is to model capture probabilities as a function of individual (e.g., age, sex, and weight) and environmental (e.g., temperature, rainfall, and location) covariates, using a generalized linear modeling (GLM) approach, such as logistic regression. The method of Huggins (1989, 1991), based on a conditional likelihood to estimate population size, has become very popular, but it assumes independence among capture occasions (Huggins and Hwang 2011).

In the analysis of capture-recapture data, Hwang and Huggins (2005) and Zhang (2012) examined the effect of heterogeneity on the estimation of population size by solving estimating equations, but these authors also assumed independence of capture occasions. Capture-recapture data are collected on the same individuals across successive capture occasions. One may view capture-recapture data as binary longitudinal or repeated measurements data (Huggins and Yip 2001). These repeated observations are often correlated over time. This dependency or correlation structure may be induced by incorporating individual heterogeneity. Failure to account for this dependency may provide biased estimates. Hwang and Huggins (2007) also state that the assumption of independence among capture occasions is often violated in practice, but the authors still rely on the assumption. Some dependencies among capture occasions can be dealt with through the modeling of behavioral effects, such as trap-happy and trap-shy effects, which are treated as special cases in the capture-recapture literature (Yang and Chao 2005; Pradel and Sanz-Aguilar 2012). One alternative approach is to use generalized estimating equations (GEE) to account for a working correlation structure among capture occasions (Liang and Zeger 1986) and to use observed individual characteristics to model heterogeneity in capture probabilities. A mixed effects modeling approach may also be used to model heterogeneity due to observed and unobserved individual characteristics in capture-recapture experiments, motivating the use of generalized linear mixed models (GLMM) (Pinheiro and Bates 2000). Some authors have previously introduced the use of GLMM (logit models with normal random effects) (e.g., Coull and Agresti 1999; Stoklosa et al. 2011). An advantage of using GLMM for the estimation of capture probabilities is that it accommodates not only the heterogeneity attributed to individual characteristics, but also the heterogeneity that cannot be explained by the observed individual characteristics.

Bayesian methods have also become popular in capture- 
recapture studies. An extensive Bayesian literature on 
capture-recapture closed population models includes Cas- 
tledine (1981), Smith (1991), George and Robert (1992), 
Madigan and York (1997), Basu and Ebrahimi (2001), 
Ghosh and Norris (2005), King and Brooks (2008), and 
Gosky and Ghosh (2009, 2011). Bayesian statistical model- 
ing requires the development of the likelihood function of 
the observed data, given a set of parameters, as well as the 
joint prior distribution of all model parameters. Bayesian 
methods allow for estimation of the unobserved random 
effects as well, but the performance of their estimates often 
depends on the chosen prior distributions. Often, the 
method of selecting prior distributions is subjective (Lee et al. 2003). A possible advantage of GEE over random-effects models and Bayesian methods relates to the ability of GEE to allow specific correlation structures to be assumed between capture occasions.

Here, we propose a GEE approach for estimating cap- 
ture probabilities and population size in capture-recap- 
ture closed population studies. We also compare the 
results of population size estimates and their SE, when 
using the two estimation methodologies (i.e., GEE and 
GLMM). For illustrative purposes, we analyze a real data 
set that has already been discussed in the literature. Con- 
ditional arguments are used to obtain a Horvitz-Thomp- 
son-like estimator for estimating population size. A 
simulation study is also conducted to compare the perfor- 
mance of the estimation procedures. In the next section, 
we describe the notation and models that are used to esti- 
mate capture probabilities and population size. 

Notation and Models 

Consider a population consisting of $N$ animals in a capture-recapture experiment over $m$ capture occasions, $j = 1, 2, \ldots, m$. Let $Y_{ij}$ be a binary outcome, equaling 1 if the $i$th animal is caught on the $j$th capture occasion and 0 otherwise. Let $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{im})'$ be a random vector with the capture history of individual $i$. Let $T_i = \sum_{j=1}^{m} Y_{ij}$ be the number of times the $i$th animal has been caught in the course of the closed population trapping study. Let $t_i$ be the time the $i$th individual is first captured. Heterogeneity in capture probabilities is often explained by an observed individual covariate $x_i$, such as age, sex, or weight. For simplicity, we consider $x_i$ a single covariate, but the model can be easily generalized for $x_i$ to be a vector of covariates. Let the probability that the $i$th animal is captured on any trapping occasion $j$ be

$$p_i(\beta) = \Pr(Y_{ij} = 1 \mid x_i) = h(X_i \beta); \quad i = 1, 2, \ldots, N, \quad j = 1, 2, \ldots, m, \qquad (1)$$

where

$$X_i = \begin{pmatrix} 1 & x_i \\ \vdots & \vdots \\ 1 & x_i \end{pmatrix}_{m \times 2}$$

is the design matrix, $\beta = (\beta_0, \beta_1)'$ is the vector of parameters associated with the covariates, and $h(u) = (1 + \exp(-u))^{-1}$ is the logistic function. This is an $M_h$ model where variation in capture probabilities among individuals is explained by the covariate $x_i$. The probability of not capturing the $i$th individual on the $j$th occasion is $(1 - p_i(\beta))$, and the variance of $Y_{ij}$ is $p_i(\beta)(1 - p_i(\beta))$ (Liang and Zeger 1986). Then, $T_i \sim \mathrm{Bin}(m, p_i(\beta))$ and $\pi_i(\beta) = 1 - (1 - p_i(\beta))^m$ is the probability of individual $i$ being captured at least once, given the covariate $x_i$. Let the set of distinct individuals captured on at least one occasion be indexed by $i = 1, 2, \ldots, n$, and let uncaptured individuals be indexed by $i = n + 1, \ldots, N$, without loss of generality. To estimate the population size, once an estimate $\hat{\beta}$ of $\beta$ is obtained, the Horvitz-Thompson estimator $\hat{N} = \sum_{i=1}^{n} 1/\pi_i(\hat{\beta})$ may be used as in Huggins (1989).
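As a minimal sketch of the Horvitz-Thompson step just described, the following Python code evaluates $\hat{N} = \sum_i 1/\pi_i(\hat{\beta})$ for a single covariate. The function names and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math


def logistic(u: float) -> float:
    """Logistic function h(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def capture_at_least_once(p: float, m: int) -> float:
    """pi = 1 - (1 - p)^m, the probability of at least one capture in m occasions."""
    return 1.0 - (1.0 - p) ** m


def horvitz_thompson(beta0: float, beta1: float, covariates: list, m: int) -> float:
    """N_hat = sum over captured animals of 1 / pi_i(beta_hat)."""
    total = 0.0
    for x in covariates:
        p = logistic(beta0 + beta1 * x)
        total += 1.0 / capture_at_least_once(p, m)
    return total


# Hypothetical inputs: a binary covariate for 45 captured animals and
# made-up coefficients (assumptions, not fitted values).
example_covariates = [1] * 22 + [0] * 23
example_n_hat = horvitz_thompson(-0.8, -0.2, example_covariates, m=6)
print(round(example_n_hat, 2))
```

Because every $\pi_i$ is below 1, the estimate always exceeds the number of captured animals; animals with low capture probability contribute the largest weights.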

Generalized estimating equations approach 

Let $V_i = A_i^{1/2} R_i(\alpha) A_i^{1/2}$ be the covariance matrix of $Y_i$, where $A_i = \mathrm{diag}[\mathrm{Var}(Y_{i1}), \mathrm{Var}(Y_{i2}), \ldots, \mathrm{Var}(Y_{im})]$ is an $m \times m$ diagonal matrix and $R_i(\alpha)$ is known as the working correlation structure among $Y_{i1}, Y_{i2}, \ldots, Y_{im}$, which describes the average dependency of individuals being captured from occasion to occasion. A GEE approach permits several types of working correlation structure $R_i(\alpha)$ (for details, see Diggle et al. 1994). For the description that follows, and for simplicity, we consider an independence working correlation structure, $R_i(\alpha) = I$, where $I$ is an identity matrix. The covariate $x_i$ is never known for the individuals that have not been captured. Therefore, $Y_i$ is conditional on the captured individuals ($n$) (i.e., $T_i \geq 1$) with the corresponding observed individual covariates, similar to Huggins (1989) and Zhang (2012). The probability that the $i$th individual is captured on the $j$th occasion ($p_{ij}$), given that the $i$th individual is observed at least once, is $\Pr(Y_{ij} = 1 \mid T_i \geq 1) = p_{ij}/(1 - \prod_{k=1}^{m}(1 - p_{ik}))$. Let $\mu_{ij} = E(Y_{ij} \mid T_i \geq 1) = p_{ij}/(1 - \prod_{k=1}^{m}(1 - p_{ik}))$, and let $D_i$ be the matrix of derivatives $\partial \mu_i / \partial \beta$, where $\mu_i = (\mu_{i1}, \mu_{i2}, \ldots, \mu_{im})'$; hence $D_i = A_i X_i$. The variance $v_{ij}$ of $Y_{ij}$ given $T_i \geq 1$ is

$$v_{ij} = \mathrm{var}(Y_{ij} \mid T_i \geq 1) = \frac{p_{ij}\{1 - p_{ij} - \prod_{k=1}^{m}(1 - p_{ik})\}}{\{1 - \prod_{k=1}^{m}(1 - p_{ik})\}^{2}}.$$

Considering $V_i = \mathrm{diag}(v_{ij})$, an estimator of $\beta$ can be obtained by solving the following generalized estimating equations:

$$\sum_{i=1}^{n} D_i' V_i^{-1} (Y_i - \mu_i) = 0. \qquad (2)$$

If covariate $x_i$ ($i = 1, 2, \ldots, n$) is available for captured individuals, then the model becomes $p_i(\beta) = h(X_i \beta)$. This model is not equivalent to any of those discussed in Otis et al. (1978); rather, this model is a restricted version of their model $M_h$ (Huggins 1991). If $p_i(\beta) = h(X_i \beta)$, then following Zhang (2012), estimating equations (2) can be simplified to

$$\sum_{i=1}^{n} \frac{1 - \{1 - p_i(\beta)\}^{m} - m\,p_i(\beta)\{1 - p_i(\beta)\}^{m-1}}{1 - \{1 - p_i(\beta)\}^{m-1}} \left[ T_i - \frac{m\,p_i(\beta)}{1 - \{1 - p_i(\beta)\}^{m}} \right] = 0,$$

$$\sum_{i=1}^{n} \frac{1 - \{1 - p_i(\beta)\}^{m} - m\,p_i(\beta)\{1 - p_i(\beta)\}^{m-1}}{1 - \{1 - p_i(\beta)\}^{m-1}} \left[ T_i - \frac{m\,p_i(\beta)}{1 - \{1 - p_i(\beta)\}^{m}} \right] x_i = 0. \qquad (3)$$

For a given $\hat{\beta}$, $\pi_i(\hat{\beta}) = 1 - (1 - p_i(\hat{\beta}))^{m}$, and an estimate of the variance of $\hat{N}$ is given by

$$\widehat{\mathrm{var}}(\hat{N}) = \sum_{i=1}^{n} \pi_i(\hat{\beta})^{-2}\{1 - \pi_i(\hat{\beta})\} + \Delta(\hat{\beta})'\, I(\hat{\beta})^{-1} \Delta(\hat{\beta}),$$

where $I(\hat{\beta})$ represents an estimate of the conditional information matrix for $\beta$ and $\Delta(\beta)$ is the vector $\sum_{i=1}^{n} \pi_i(\beta)^{-2}\, \partial \pi_i(\beta)/\partial \beta$. If the individual capture probability does not depend on time, previous capture history, or any covariate, then the model (1) simplifies to $p_i(\beta) = h(\beta_0) = p_0$, which is a reparameterization of model $M_0$ of Otis et al. (1978) (see Huggins 1991; Hwang and Huggins 2005). This model assumes all the individuals have equal capture probabilities. Then, the estimating equation for $\beta_0$ simplifies to

$$\sum_{i=1}^{n} \left[ T_i - \frac{m\,p_0}{1 - (1 - p_0)^{m}} \right] = 0. \qquad (4)$$

Let $\hat{\beta}_0$ be the resulting estimator of $\beta_0$; then $\hat{\pi}_0 = 1 - (1 - \hat{p}_0)^{m}$, where $\hat{p}_0 = h(\hat{\beta}_0)$.

Methods based on a partial likelihood

The full likelihood of all model parameters is proportional to

$$\prod_{i=1}^{n} \frac{p_i(\beta)^{T_i}\{1 - p_i(\beta)\}^{m - T_i}}{\pi_i(\beta)} \times \prod_{i=1}^{n} \pi_i(\beta) \prod_{i=n+1}^{N} \{1 - \pi_i(\beta)\}. \qquad (5)$$

As the number of total individuals, $N$, is unknown and the covariates are not known for individuals that are never captured, this likelihood cannot be directly evaluated. The conditional likelihood (Huggins 1989) is the first product component, and it can be formulated as a GLM (Huggins and Hwang 2011) for the positive Binomial distribution (Patil 1962). It may be rewritten as

$$\prod_{i=1}^{n} p_i(\beta)^{T_i - 1}\{1 - p_i(\beta)\}^{m - t_i - (T_i - 1)} \times \prod_{i=1}^{n} \frac{\{1 - p_i(\beta)\}^{t_i - 1}\, p_i(\beta)}{\pi_i(\beta)}. \qquad (6)$$

When the full likelihood is partitioned into a product of conditional densities, a partial likelihood (Cox 1975) may arise by considering only some of the product terms; it involves only the parameters of interest, isolating the nuisance parameters. Therefore, the partial likelihood, $PL(\beta)$, is the first product of equation (6), which is the likelihood of the number of recaptures after the first capture (Stoklosa et al. 2011). For a given $t_i$, $(T_i - 1) \mid t_i \sim \mathrm{Bin}(m - t_i, p_i(\beta))$, which is used to estimate the parameters $\beta$.
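To make the intercept-only case concrete, the sketch below solves an estimating equation of the form $\sum_i [T_i - m p_0/\{1 - (1 - p_0)^m\}] = 0$ by bisection and then forms $\hat{N} = n/\hat{\pi}_0$. Bisection is a choice made here for simplicity rather than the procedure used in the paper, and the input counts are hypothetical.

```python
def expected_captures_given_seen(p: float, m: int) -> float:
    """E(T_i | T_i >= 1) = m p / (1 - (1 - p)^m) under a constant capture probability."""
    return m * p / (1.0 - (1.0 - p) ** m)


def solve_p0(n: int, total_captures: int, m: int, tol: float = 1e-12) -> float:
    """Find p0 such that sum_i T_i = n * m * p0 / (1 - (1 - p0)^m), by bisection."""
    target = total_captures / n
    if not 1.0 < target < m:
        raise ValueError("mean captures per animal must lie strictly between 1 and m")
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The conditional mean is increasing in p, so bisection applies.
        if expected_captures_given_seen(mid, m) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def population_size_m0(n: int, total_captures: int, m: int) -> float:
    """N_hat = n / pi0_hat with pi0_hat = 1 - (1 - p0_hat)^m."""
    p0 = solve_p0(n, total_captures, m)
    return n / (1.0 - (1.0 - p0) ** m)


# Hypothetical summary counts: 40 distinct animals, 90 captures, 5 occasions.
p0_hat = solve_p0(40, 90, 5)
n_hat_m0 = population_size_m0(40, 90, 5)
print(round(p0_hat, 4), round(n_hat_m0, 2))
```

Only the number of distinct animals, the total number of captures, and the number of occasions enter this calculation, which reflects the equal-catchability assumption of the model.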

To utilize a simple GLMM with a random effect, we suppose that $p_i(\beta) = h(X_i \beta + \sigma_v z_i)$, where $z_i$ is a realization of the standard normal random variable $Z_i \sim N(0, 1)$, with $\sigma_v > 0$. The use of random effects reflects the belief that there is heterogeneity that cannot be explained by covariates. The partial likelihood can be considered as the joint distribution of the response and the random effects. To estimate $\beta$ and $\sigma_v$, the marginal likelihood of the response is obtained by integrating out the random effects. The integration can be approximated by penalized quasi-likelihood (Breslow and Clayton 1993), which enables parameter estimation via an iterative procedure.
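The idea of integrating out the random effect can be illustrated numerically. The sketch below uses a plain trapezoidal rule over a grid of $z$ values, not penalized quasi-likelihood, to approximate the marginal probability of observing $k$ recaptures in a given number of remaining occasions. The function names, the grid settings, and the example values are assumptions made for illustration.

```python
import math


def logistic(u: float) -> float:
    """Logistic function h(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def binom_pmf(k: int, size: int, p: float) -> float:
    """Binomial probability of k successes in `size` trials."""
    return math.comb(size, k) * p ** k * (1.0 - p) ** (size - k)


def std_normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_recapture_prob(k: int, size: int, eta: float, sigma: float,
                            grid: int = 2001, zmax: float = 8.0) -> float:
    """Approximate the integral of Bin(k; size, h(eta + sigma z)) phi(z) dz."""
    step = 2.0 * zmax / (grid - 1)
    total = 0.0
    for g in range(grid):
        z = -zmax + g * step
        weight = 0.5 if g in (0, grid - 1) else 1.0  # trapezoid end points
        total += weight * binom_pmf(k, size, logistic(eta + sigma * z)) * std_normal_pdf(z)
    return total * step


# Illustrative values only: linear predictor -0.8, random-effect SD 1.5.
print(marginal_recapture_prob(2, 5, -0.8, 1.5))
```

Setting `sigma` to zero recovers the ordinary binomial probability, which is the fixed-effects special case; larger values of `sigma` spread probability mass toward very low and very high recapture counts.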

The variance of $\hat{N}$ for a smoothing parameter $\lambda$ may be estimated according to Stoklosa et al. (2011) using the following formula,

$$\widehat{\mathrm{Var}}(\hat{N}_{\lambda}) = \sum_{i=1}^{n} \frac{1 - \pi_i(\beta)}{\pi_i(\beta)^{2}} + \{X'\eta(\beta)\}'\, \mathrm{Var}(\hat{\beta})\, \{X'\eta(\beta)\},$$

where $\eta(\beta)$ is a vector with $\eta_i(\beta) = \pi_i(\beta)^{-2}\, m\, p_i(\beta)\{1 - \pi_i(\beta)\}$, and all quantities are evaluated at $\hat{\beta}$. The smoothing parameter $\lambda$, which is part of the quasi-likelihood procedure, controls the degree of roughness of the estimated functions. To obtain an optimal value for $\lambda$, we used the generalized cross-validation (GCV) technique (Wood 2006).
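A variance expression with this two-part structure, a sampling term plus a delta-method term for the uncertainty in $\hat{\beta}$, can be evaluated as in the sketch below. The design rows, coefficient values, and covariance matrix passed in are hypothetical inputs, and the function name is an assumption of this example.

```python
import math


def logistic(u: float) -> float:
    """Logistic function h(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def variance_of_n_hat(beta: list, cov_beta: list, design_rows: list, m: int) -> float:
    """sum_i (1 - pi_i) / pi_i^2  +  g' Var(beta_hat) g,  with g = X' eta."""
    q = len(beta)
    sampling_term = 0.0
    g = [0.0] * q
    for row in design_rows:
        p = logistic(sum(b * x for b, x in zip(beta, row)))
        pi = 1.0 - (1.0 - p) ** m
        sampling_term += (1.0 - pi) / pi ** 2
        eta_i = m * p * (1.0 - pi) / pi ** 2  # eta_i = pi_i^-2 * m * p_i * (1 - pi_i)
        for k in range(q):
            g[k] += row[k] * eta_i
    delta_term = sum(g[a] * cov_beta[a][b] * g[b] for a in range(q) for b in range(q))
    return sampling_term + delta_term


# Hypothetical example: intercept plus one binary covariate, six occasions.
rows = [[1.0, 1.0]] * 22 + [[1.0, 0.0]] * 23
cov = [[0.03, -0.02], [-0.02, 0.06]]  # made-up covariance of beta_hat
print(round(math.sqrt(variance_of_n_hat([-0.8, -0.2], cov, rows, 6)), 2))
```

The square root of the returned value plays the role of the SE reported next to each population estimate.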



Table 1. Comparison of parameter estimates (SE in parentheses) for least chipmunk data after fitting models with and without a covariate (sex).

Model no.      $\mathrm{logit}\{p_i(\beta)\}$                               $\hat{N}$

Intercept-only models
1. PL GLM      -0.82 (0.18)                                         50.72 (3.33)
2. QL GEE      -0.73 (0.13)                                         49.56 (2.27)
3. PL GLMM     -0.85 (0.26) - 0.00 $z_i$ (0.73)                     50.73 (3.35)

Linear covariate models
4. PL GLM      -0.81 (0.25)   0.03 sex (0.37)                       50.73 (3.35)
5. QL GEE      -0.84 (0.18) - 0.21 sex (0.25)                       52.40 (2.94)
6. PL GLMM     -0.83 (0.34) - 0.14 sex (0.49) + 1.59 $z_i$ (0.00)   74.16 (12.06)

A realization of the standard normal random variable $Z_i \sim N(0, 1)$ is $z_i$. Numbers in this table are rounded to two decimal places; therefore, 0.00 does not mean zero.

QL, quasi-likelihood; PL, partial likelihood; GLM, generalized linear models; GEE, generalized estimating equations; GLMM, generalized linear mixed models.



Application 

We applied the techniques discussed in the previous section to a data set of least chipmunks (Eutamias minimus) made available by V. Reid (1975). The data set has been previously analyzed and discussed by Otis et al. (1978) and Wang et al. (2007). V. Reid laid out a 9 x 11 live-trapping grid with traps spaced 50 feet (15.2 m) apart. The study was conducted in an area dominated by sagebrush and snowberry in Colorado, USA. The numbers of animals caught on the six occasions ($n_1$ to $n_6$) were 7, 15, 16, 24, 19, 7, and $\sum_j n_j = 88$. From these 88 captures, $n = 45$ distinct animals were identified, and the covariate sex (male or female) was recorded for each captured individual; there were 22 males and 23 females. The recorded capture frequencies ($f_1$ to $f_6$) were 21, 12, 7, 3, 2, 0. The average capture frequencies for males and females were 1.86 and 2.04, respectively. Our estimation results are summarized in Table 1. The inclusion of the covariate sex does not improve our estimates of population size, which are very similar, except when the random effect is considered in the GLMM, which is based on partial likelihood estimation. This may indicate that there is unmodeled individual heterogeneity in capture probabilities that is not being accounted for with the other models (GLM and GEE). The population estimate, in this case, is approximately 74 individuals with an SE of 12. Both values are quite high when compared to the values obtained with the other estimation strategies. Although GLMM accounts for heterogeneity due to unobserved individual characteristics, it may also be overestimating population size at the expense of a greater loss in precision, possibly due to the increase in the number of model parameters that are estimated. In contrast, the quasi-likelihood GEE methodology provided lower SE when compared to results from the Bayesian approach of Wang et al. (2007) for the same data set. The latter authors estimated a population size of 50 with an SE of 3.14. The GEE estimation results also agree with Otis et al. (1978), but our model jointly takes into account heterogeneity in capture probabilities and correlation among capture occasions.
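The summary counts quoted for the chipmunk data can be cross-checked with a few lines of Python, and the intercept-only fits in Table 1 can be back-transformed to a population estimate through $\hat{N} = n/\{1 - (1 - \hat{p}_0)^m\}$. This is only a consistency sketch: the intercepts are the rounded values printed in Table 1, so the back-transformed estimates are expected to agree with the tabled $\hat{N}$ only approximately.

```python
import math

# Counts reported in the text for the six trapping occasions.
occasion_counts = [7, 15, 16, 24, 19, 7]       # animals caught on occasions 1..6
capture_frequencies = [21, 12, 7, 3, 2, 0]     # animals caught exactly 1..6 times

m = len(occasion_counts)
total_captures = sum(occasion_counts)
n_distinct = sum(capture_frequencies)
total_from_frequencies = sum(k * f for k, f in enumerate(capture_frequencies, start=1))
mean_captures = total_captures / n_distinct


def n_hat_from_intercept(beta0: float, n: int, occasions: int) -> float:
    """Back-transform an intercept-only logit fit to N_hat = n / pi0."""
    p0 = 1.0 / (1.0 + math.exp(-beta0))
    pi0 = 1.0 - (1.0 - p0) ** occasions
    return n / pi0


print(total_captures, n_distinct, round(mean_captures, 2))
print(round(n_hat_from_intercept(-0.82, n_distinct, m), 2))  # PL GLM intercept from Table 1
print(round(n_hat_from_intercept(-0.73, n_distinct, m), 2))  # QL GEE intercept from Table 1
```

Both ways of counting captures, by occasion and by capture frequency, give the same total, which confirms that the two sets of counts in the text are mutually consistent.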

Simulation Study 

A simulation study was conducted in order to evaluate the performance of the estimators. The effect of heterogeneity among observed individuals was modeled using two covariates, sex (male = 1 and female = 0) and weight. Two levels of population size, $N = 100$ and 500, and two levels of capture occasions, $m = 6$ and 10, were considered. For each individual, we assigned sex with probability 0.5 from a Bernoulli distribution and weight from a normal distribution with mean 15 and variance 4. These values are based on the previous data analysis. Individual capture probabilities were modeled with a logistic regression, so that

$$p_i = \frac{\exp(\beta_0 + \beta_1 \times \mathrm{sex}_i + \beta_2 \times \mathrm{weight}_i)}{1 + \exp(\beta_0 + \beta_1 \times \mathrm{sex}_i + \beta_2 \times \mathrm{weight}_i)},$$

where $\beta_0$ is the constant term, and $\beta_1$ and $\beta_2$ represent the sex and weight effects, respectively. A positive $\beta_1$ implies that the sex taking value 1 is more catchable, and a positive $\beta_2$ means that the catchability increases with weight. We considered three different simulation scenarios for capture probabilities: (a) high capture probabilities ($\beta_0 = -3.5$); (b) medium capture probabilities ($\beta_0 = -4.0$); and (c) low capture probabilities ($\beta_0 = -4.5$). The corresponding average capture probabilities are presented in Table 2. In addition, a Gaussian random effect with mean 0 and $\sigma_v = 0.1$ was included as an unobserved covariate to ensure the existence of heterogeneity due to unobserved individual characteristics. For each simulation scenario, the GLM, GEE, and GLMM approaches were used for data analysis and to assess the performance of the estimators. The simulation study was carried out with 1000 Monte Carlo replicates.
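One replicate of a data-generating design like this can be sketched as follows. The code assumes that the random effect enters the linear predictor additively as $\sigma_v z_i$ and that capture occasions are drawn independently given $p_i$; both are assumptions of this illustration, as are the function names and the seed.

```python
import math
import random


def logistic(u: float) -> float:
    """Logistic function h(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def simulate_capture_histories(N: int, m: int, beta0: float, beta1: float = 0.1,
                               beta2: float = 0.2, sigma: float = 0.1,
                               seed=None) -> list:
    """Return (sex, weight, history) for animals captured at least once."""
    rng = random.Random(seed)
    captured = []
    for _ in range(N):
        sex = 1 if rng.random() < 0.5 else 0   # Bernoulli(0.5)
        weight = rng.gauss(15.0, 2.0)          # mean 15, variance 4 (SD 2)
        z = rng.gauss(0.0, 1.0)                # unobserved standard normal effect
        p = logistic(beta0 + beta1 * sex + beta2 * weight + sigma * z)
        history = [1 if rng.random() < p else 0 for _ in range(m)]
        if sum(history) > 0:                   # uncaptured animals are never observed
            captured.append((sex, weight, history))
    return captured


# Scenario (a), N = 100, m = 6; the seed is arbitrary.
sample = simulate_capture_histories(100, 6, beta0=-3.5, seed=1)
print(len(sample))

# Capture probability for a male of weight 15 in scenario (a).
print(round(logistic(-3.5 + 0.1 + 0.2 * 15), 2))
```

Only the animals returned by the function would be available to an estimator; the covariates of the uncaptured animals are discarded, mirroring the conditioning on captured individuals.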

To evaluate the estimators' performance, we present the SE, the percentage relative bias (PRB), the root mean square error (RMSE), the coefficient of variation (CV), and confidence interval coverage (%) (COV) for the estimates of population size. The simulation results for six capture occasions are given in Table 3. We noticed that all estimation procedures for scenario (a) perform well. There was little bias, low SE, and low coefficient of variation for $\hat{N}$. In this scenario, confidence interval coverage for all estimators is very good (93-96%), considering a nominal level of 95%. As in our example, the exception is the GLMM, which tends to overestimate population size. Overestimation is particularly severe when capture probabilities are low; see, for instance, the results of scenarios (b) and (c). Confidence interval coverage for GLMM is also poor (77-90%) in these scenarios. For all scenarios, the GEE approach performs well when estimating population size. This approach also consistently provides lower SE and lower RMSE when compared to GLM
and GLMM estimators, although the differences are minimal for GEE-GLM comparisons. Therefore, our simulation results indicate that the general performance of estimators obtained from GEE is better than that of estimators obtained from GLM and GLMM. The GEE approach may overcome the effect of random effects due to its ability to account for the correlation structure among capture occasions. The simulation results for 10 capture occasions are presented in Table 4. The performance of estimators for 10 capture occasions is better than for six capture occasions, yielding lower CV, absolute value of PRB, and RMSE, but higher COV. This is generally true because the average probability of being captured at least once is higher for 10 capture occasions than for six capture occasions (Table 2). We also conducted simulations for two other levels of $N$ (50 and 200) when $m = 6$ and 10. These results are similar to the ones presented here.

Table 2. Simulated capture probability scenarios for the capture probability model, $\mathrm{logit}(p_i) = \beta_0 + \beta_1 \times \mathrm{sex} + \beta_2 \times \mathrm{weight}$. $p$ represents the average capture probability when weight = 15, and $\pi$ represents the average probability of an individual being captured at least once during the study.

                       Effects of covariates     $p$              $\pi$, m = 6     $\pi$, m = 10
Simulation scenarios   $\beta_0$  $\beta_1$  $\beta_2$   Male   Female    Male   Female    Male   Female
(a) High               -3.5   0.1   0.2          0.40   0.38      0.95   0.94      0.98   0.98
(b) Medium             -4.0   0.1   0.2          0.29   0.27      0.87   0.85      0.94   0.92
(c) Low                -4.5   0.1   0.2          0.20   0.18      0.73   0.70      0.83   0.80

Table 3. Simulation results (1000 repetitions) considering m = 6 trapping occasions.

Method      N     n     AVE($\hat{N}$)   SE($\hat{N}$)   PRB     CV      RMSE    COV
(a) High
PL GLM      100   92    100.63    3.77     0.63    3.75    3.82    94.5
QL GEE      100   92    100.66    2.90     0.66    2.88    2.97    95.8
PL GLMM     100   92    101.81    4.30     1.81    4.22    4.67    95.9
PL GLM      500   460   500.65    7.97     0.13    1.59    8.00    93.2
QL GEE      500   460   500.87    6.28     0.17    1.25    6.34    95.3
PL GLMM     500   460   506.56    9.07     1.31    1.79    11.20   93.1
(b) Medium
PL GLM      100   84    101.56    7.16     1.56    7.05    7.33    94.3
QL GEE      100   84    101.51    4.82     1.51    4.75    5.05    95.2
PL GLMM     100   84    106.58    9.06     6.58    8.50    11.20   89.1
PL GLM      500   421   501.74    14.89    0.35    2.97    15.00   94.6
QL GEE      500   421   501.92    10.31    0.38    2.05    10.50   95.2
PL GLMM     500   421   526.33    18.90    5.27    3.59    32.40   83.3
(c) Low
PL GLM      100   69    104.61    14.01    4.61    13.40   14.80   95.7
QL GEE      100   69    103.53    7.48     3.53    7.22    8.27    94.6
PL GLMM     100   69    131.07    21.14    31.07   16.10   37.60   77.2
PL GLM      500   356   504.24    26.68    0.85    5.29    27.00   95.0
QL GEE      500   356   503.86    15.45    0.77    3.07    15.90   94.5
PL GLMM     500   356   576.72    37.06    15.34   6.43    85.20   77.4

Averages of the numbers of captured individuals, ($n$); the estimates of population size, AVE($\hat{N}$); SE of the estimated population size, SE($\hat{N}$); percentage relative bias, PRB = $100 \times (E(\hat{N}) - N)/N$, where $E(\hat{N})$ is estimated by AVE($\hat{N}$); root mean square error, RMSE = $\sqrt{\mathrm{Var}(\hat{N}) + \mathrm{Bias}^2}$; percentage coefficient of variation, CV = $100 \times \mathrm{SE}(\hat{N})/E(\hat{N})$; and confidence interval coverage (%), COV.

QL, quasi-likelihood; PL, partial likelihood; GLM, generalized linear models; GEE, generalized estimating equations; GLMM, generalized linear mixed models.
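The performance measures defined in the table footnotes can be computed from a list of replicate estimates as in the sketch below. It assumes that the SE is taken as the sample standard deviation of the estimates across replicates, which is one plausible reading of the footnote rather than something the text states; the function name and the toy input are hypothetical.

```python
import math


def summarize_estimates(estimates: list, true_n: float) -> dict:
    """Return AVE, SE, PRB, CV and RMSE for replicate population-size estimates."""
    r = len(estimates)
    ave = sum(estimates) / r
    var = sum((e - ave) ** 2 for e in estimates) / (r - 1)  # sample variance
    se = math.sqrt(var)
    bias = ave - true_n
    return {
        "AVE": ave,
        "SE": se,
        "PRB": 100.0 * bias / true_n,         # percentage relative bias
        "CV": 100.0 * se / ave,               # percentage coefficient of variation
        "RMSE": math.sqrt(var + bias ** 2),   # root mean square error
    }


# Toy replicate estimates for a true population of 100 animals.
summary = summarize_estimates([98.0, 102.0, 100.0, 104.0], true_n=100.0)
print({name: round(value, 3) for name, value in summary.items()})
```

The same RMSE identity can be checked against a tabled row: for PL GLM with N = 100 in scenario (a) of Table 3, an SE of 3.77 and a bias of 0.63 give $\sqrt{3.77^2 + 0.63^2} \approx 3.82$, which matches the reported RMSE.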

Discussion 

Individual heterogeneity and time dependence are fundamentally important in real-life applications of capture-recapture studies. The main purpose of this study was to compare estimates of population size and their SE using statistical techniques such as quasi-likelihood for GEE and partial likelihood for GLM and GLMM. We also present a GEE approach that permits capture-recapture data analysis using individual covariates and that accounts for heterogeneity in capture probabilities and for correlation among capture occasions. Evaluating the pattern of time dependency is important in several regards: (i) it may help characterize the relationship between the capture probability and covariates and (ii) it is also important for estimating the population parameters accurately in capture-recapture studies. A natural question that arises is "what happens if one ignores the time dependency and uses the traditional regression methodology assuming independence among capture occasions?" From a statistical point of view, there are at least two consequences of ignoring time dependency: incorrect assessment of the regression estimates and inefficient estimation of regression coefficients. Therefore, estimated capture probabilities may be incorrect and consequently population size may not be accurately estimated if time dependency is ignored. The quasi-likelihood GEE approach seems to perform better than the GLM and GLMM approaches because the SE of the estimated population size are consistently lower. The estimators perform well when average capture probabilities are high, but it is hard to obtain reliable estimates with the GLMM approach for low capture probabilities. However, other existing methods in capture-recapture studies allowing for heterogeneity
have similar problems (Nichols and Pollock 1983; Nichols 1986). For cases where only a small proportion of individuals are captured, the GEE approach provides better RMSE and is robust to violation of the assumption of independence among capture occasions. This approach also provides a means of exploring factors thought to be responsible for differences in capture probability among individuals. Hence, it is important to account for the correlation structure among capture occasions when estimating animal population parameters in capture-recapture studies. Future work could focus on expansion of the simulations to assess the performance of estimators based on GEE, GLMM, and Bayesian methods for capture-recapture studies. Extensions of this work to model $M_{th}$ may also be possible after imposing some parameter constraints. The GEE approach accounts for individual heterogeneity in capture probability as a function of covariates and for correlation among capture occasions. It would be interesting to modify our proposed approach to additionally account for individual heterogeneity that cannot be explained by covariates. Researchers may also extend this approach to open population models to estimate unknown animal abundance in capture-recapture studies.

Table 4. Simulation results (1000 repetitions) considering m = 10 trapping occasions.

Method      N     n     AVE($\hat{N}$)   SE($\hat{N}$)   PRB     CV      RMSE    COV
(a) High
PL GLM      100   98    100.11    1.43     0.11    1.43    1.43    94.3
QL GEE      100   98    100.14    1.36     0.14    1.35    1.36    96.3
PL GLMM     100   98    100.15    1.45     0.15    1.44    1.45    94.6
PL GLM      500   492   500.20    3.11     0.04    0.62    3.11    95.1
QL GEE      500   492   500.18    3.03     0.04    0.61    3.03    96.2
PL GLMM     500   492   500.28    3.19     0.06    0.64    3.20    94.9
(b) Medium
PL GLM      100   95    100.47    3.14     0.47    3.12    3.17    95.2
QL GEE      100   95    100.42    2.98     0.42    2.97    3.01    96.5
PL GLMM     100   95    100.92    3.32     0.92    3.29    3.45    93.4
PL GLM      500   473   500.76    6.71     0.15    1.34    6.75    94.6
QL GEE      500   473   500.66    6.35     0.13    1.27    6.38    96.1
PL GLMM     500   473   502.03    7.20     0.41    1.43    7.48    94.1
(c) Low
PL GLM      100   86    101.25    6.18     1.25    6.11    6.31    96.4
QL GEE      100   86    101.31    5.97     1.31    5.89    6.11    94.2
PL GLMM     100   86    104.71    7.35     4.71    7.02    8.73    88.6
PL GLM      500   431   500.98    13.04    0.20    2.60    13.08   95.0
QL GEE      500   431   500.65    12.57    0.13    2.51    12.58   95.4
PL GLMM     500   431   512.15    15.21    2.43    2.97    19.46   88.7

Averages of the numbers of captured individuals, ($n$); the estimates of population size, AVE($\hat{N}$); SE of the estimated population size, SE($\hat{N}$); percentage relative bias, PRB = $100 \times (E(\hat{N}) - N)/N$, where $E(\hat{N})$ is estimated by AVE($\hat{N}$); root mean square error, RMSE = $\sqrt{\mathrm{Var}(\hat{N}) + \mathrm{Bias}^2}$; percentage coefficient of variation, CV = $100 \times \mathrm{SE}(\hat{N})/E(\hat{N})$; and confidence interval coverage (%), COV.

QL, quasi-likelihood; PL, partial likelihood; GLM, generalized linear models; GEE, generalized estimating equations; GLMM, generalized linear mixed models.